Abstract. Kullback-Leibler relative-entropy, in cases involving distributions resulting
from relative-entropy minimization, has a celebrated property reminiscent of
squared Euclidean distance: it satisfies an analogue of Pythagoras' theorem.
Hence, this property is referred to as Pythagoras' theorem of relative-entropy
minimization, or the triangle equality, and it plays a fundamental role in geometrical
approaches to statistical estimation theory such as information geometry. An equivalent of
Pythagoras' theorem in the generalized nonextensive formalism is established in
(Dukkipati et al., Physica A, 361 (2006) 124-138 [1]). In this paper we give a detailed
account of it.
1. Introduction

Apart from being a fundamental measure of information, Kullback-Leibler relative-entropy,
or KL-entropy, plays the role of a 'measure of distance' between two probability
distributions in statistics. Since it is not a metric, at first glance it might seem that
the geometrical interpretations that metric distance measures usually provide are
not possible with the KL-entropy playing the role of a distance measure on a
space of probability distributions. But it is a pleasant surprise that one can
formulate certain geometric propositions for probability distributions, with the
relative-entropy playing the role of squared Euclidean distance. Some of these geometrical
interpretations cannot be derived from the properties of KL-entropy alone, but only from
the properties of "KL-entropy minimization"; that is, these
geometrical formulations are possible only when probability distributions resulting from
ME-prescriptions of KL-entropy are involved.

As demonstrated by Kullback [2], minimization problems of relative-entropy with
respect to a set of moment constraints find their importance in the well-known Kullback's
minimum entropy principle and thereby play a basic role in the information-theoretic
approach to statistics [3, 4]. They frequently occur elsewhere also, e.g., in the theory of
large deviations [5], and in statistical physics, as maximization of entropy [6, 7].

Kullback's minimum entropy principle can be considered as a general method of
inference about an unknown probability distribution when there exists a prior estimate
of the distribution and new information in the form of constraints on expected values [8].
Formally, one can state this principle as: given a prior distribution $r$, of all the
probability distributions that satisfy the given moment constraints, one should choose
the posterior $p$ with the least relative-entropy. The prior distribution $r$ can be a reference
distribution (uniform, Gaussian, Lorentzian or Boltzmann etc.) or a prior estimate of $p$.
The principle of Jaynes maximum entropy is a special case of minimization of
relative-entropy under appropriate conditions [9].

Many properties of relative-entropy minimization just reflect well-known properties
of relative-entropy, but there are surprising differences as well. For example,
relative-entropy does not generally satisfy a triangle relation involving three arbitrary probability
distributions. But in certain important cases involving distributions that result from
relative-entropy minimization, relative-entropy satisfies a theorem comparable to
Pythagoras' theorem, cf. [10] and [11, §11]. In this geometrical interpretation,
relative-entropy plays the role of squared distance and minimization of relative-entropy appears
as the analogue of projection on a sub-space in a Euclidean geometry. This property is
also known as the triangle equality [8].



The main aim of this paper is to study the possible generalization of Pythagoras'
theorem to the nonextensive case. Before we take up this problem, we present the
properties of Tsallis relative-entropy minimization and present some differences with
the classical case. In the representation of such a minimum entropy distribution, we
highlight the use of the q-product (a q-deformed version of multiplication), an operator
that has been introduced recently to derive the mathematical structure behind
Tsallis statistics. In particular, the q-product representation of the Tsallis minimum
relative-entropy distribution will be useful for the derivation of the equivalent of the triangle
equality for Tsallis relative-entropy. We mention here that a general class of
relative-entropy functionals which satisfy a Pythagorean relation is established by Grünwald and
Dawid [12]. Recently a Pythagoras' theorem for a version of Rényi relative entropy was
reported by Sundaresan [13].

Before we conclude this introduction on geometrical ideas of relative-entropy
minimization, we make a note on the other geometric approaches. One approach is
that of Rao [14], where one looks at the set of probability distributions on a sample
space as a differential manifold and introduces a Riemannian geometry on this manifold.
This approach was pioneered by Čencov [11] and Amari [15], who have shown the existence
of a particular Riemannian geometry which is useful in understanding some questions
of statistical inference. This Riemannian geometry turns out to have some interesting
connections with information theory and, as shown by Campbell [16], with the minimum
relative-entropy. In this approach too, the above mentioned Pythagoras' theorem plays
an important role [17, pp. 72].

The other idea involves the use of Hausdorff dimension [18, 19] to understand
why minimizing relative-entropy should provide useful results. This approach was
begun by Eggleston [20] for a special case of maximum entropy and was developed
by Campbell [21]. For an excellent review on various geometrical aspects associated
with minimum relative-entropy one can refer to [22].

This paper is organized as follows. We present the necessary
background in §2, where we discuss properties of relative-entropy minimization in the
classical case. In §3, we present the ME prescriptions of Tsallis relative-entropy and
discuss their differences with the classical case. Finally, the derivation of Pythagoras'
theorem in the nonextensive case is presented in §4.

Regarding the notation, we define all the information measures on the measurable
space $(X, \mathfrak{M})$. The default reference measure is $\mu$ unless otherwise stated. For simplicity
in exposition, we will not distinguish between functions differing on a $\mu$-null set only;
nevertheless, we can work with equations between $\mathfrak{M}$-measurable functions on $X$ if they
are stated as being valid only $\mu$-almost everywhere ($\mu$-a.e. or a.e.). Further, we assume
that all the quantities of interest exist and also assume, implicitly, the $\sigma$-finiteness of $\mu$
and $\mu$-continuity of probability measures whenever required. Since these assumptions
repeatedly occur in various definitions and formulations, they will not be mentioned
in the sequel. With these assumptions we do not distinguish between an information
measure of a pdf $p$ and that of the corresponding probability measure $P$; hence when
we give definitions of information measures for pdfs, we also use the corresponding
definitions for probability measures, wherever convenient or required, with the
understanding that $P(E) = \int_E p \, d\mu$, and the converse holding as a result of the
Radon-Nikodym theorem, with $p = \frac{dP}{d\mu}$. In both cases we have $P \ll \mu$.

Note that though the results presented in this paper do not involve major
measure-theoretic concepts, we write all the integrals with respect to the measure $\mu$, as a
convention; these integrals can be replaced by summations in the discrete case or
Lebesgue integrals in the continuous case.



2. Relative-Entropy Minimization in the Classical Case 

Kullback's minimum entropy principle can be stated formally as follows. Given a prior
distribution $r$ and a finite set of moment constraints of the form

$$ \int_X u_m(x)\, p(x)\, d\mu(x) = \langle u_m \rangle , \quad m = 1, \ldots, M , \tag{1} $$

one should choose the posterior $p$ which minimizes the relative-entropy

$$ I(p\|r) = \int_X p(x) \ln \frac{p(x)}{r(x)}\, d\mu(x) . \tag{2} $$

In (1), $\langle u_m \rangle$, $m = 1, \ldots, M$, are the known expectation values of $\mathfrak{M}$-measurable functions
$u_m : X \to \mathbb{R}$, $m = 1, \ldots, M$, respectively.

With reference to (2) we clarify here that, though we mainly use expressions
of relative-entropy defined for pdfs in this paper, we use expressions in terms of the
corresponding probability measures as well. For example, when we write the Lagrangian
for relative-entropy minimization below, we use the definition of relative-entropy

$$ I(P\|R) = \begin{cases} \displaystyle\int_X \ln \frac{dP}{dR}\, dP & \text{if } P \ll R , \\ +\infty & \text{otherwise,} \end{cases} \tag{3} $$

for probability measures $P$ and $R$, corresponding to pdfs $p$ and $r$ respectively. This
correspondence between probability measures $P$ and $R$ and pdfs $p$ and $r$, respectively,
will not be described again in the sequel.

2.1. Canonical Minimum Entropy Distribution 

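To minimize the relative-entropy (2) with respect to the constraints (1), the Lagrangian
turns out to be

$$ \mathcal{L}(x, \lambda, \beta) = \int_X \ln \frac{dP}{dR}(x)\, dP(x) + \lambda \left( \int_X dP(x) - 1 \right) + \sum_{m=1}^{M} \beta_m \left( \int_X u_m(x)\, dP(x) - \langle u_m \rangle \right) , \tag{4} $$

where $\lambda$ and $\beta_m$, $m = 1, \ldots, M$, are Lagrange multipliers. The solution is given by

$$ \ln \frac{dP}{dR}(x) + \lambda + \sum_{m=1}^{M} \beta_m u_m(x) = 0 , $$

and the solution can be written in the form of

$$ \frac{dP}{dR}(x) = \frac{e^{-\sum_{m=1}^{M} \beta_m u_m(x)}}{\int_X e^{-\sum_{m=1}^{M} \beta_m u_m(x)}\, dR(x)} . \tag{5} $$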
Finally, the posterior distribution $p(x) = \frac{dP}{d\mu}$ given by Kullback's minimum
entropy principle can be written in terms of the prior $r(x) = \frac{dR}{d\mu}$ as

$$ p(x) = \frac{r(x)\, e^{-\sum_{m=1}^{M} \beta_m u_m(x)}}{Z} , \tag{6} $$

where

$$ Z = \int_X r(x)\, e^{-\sum_{m=1}^{M} \beta_m u_m(x)}\, d\mu(x) \tag{7} $$

is the partition function.
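As an illustration of (6) and (7), the following minimal sketch computes the minimum relative-entropy posterior on a finite sample space with a single constraint ($M = 1$). The three-point space, the prior `r`, the function `u`, the target value, and the helper names `posterior`, `mean` and `solve_beta` are assumptions made for this example; the multiplier is found by plain bisection, which is just one convenient choice and not part of the original derivation.

```python
import math

def posterior(r, u, beta):
    """Return (p, Z) with p(x) = r(x) * exp(-beta * u(x)) / Z, as in (6)-(7)."""
    weights = [ri * math.exp(-beta * ui) for ri, ui in zip(r, u)]
    Z = sum(weights)
    return [w / Z for w in weights], Z

def mean(p, u):
    """Expectation of u under the discrete distribution p."""
    return sum(pi * ui for pi, ui in zip(p, u))

def solve_beta(r, u, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection on beta; the mean of u under p decreases as beta grows."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p, _ = posterior(r, u, mid)
        if mean(p, u) > target:
            lo = mid  # mean still too large: a larger beta is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example data (not from the paper).
r = [0.5, 0.3, 0.2]      # prior
u = [0.0, 1.0, 2.0]      # constraint function
target = 1.2             # prescribed expectation <u>

beta = solve_beta(r, u, target)
p, Z = posterior(r, u, beta)
rel_entropy = sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))
print("beta =", beta)
print("p    =", p)
print("I(p||r) =", rel_entropy)
```

In the discrete case the integrals become sums, so `Z` is simply the sum of the unnormalized weights.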

Relative-entropy minimization has been applied to many problems in statistics [2]
and statistical mechanics [23]. The other applications include pattern recognition [24],
spectral analysis [25], speech coding [26], estimation of prior distribution for Bayesian
inference [27] etc. For a list of references on applications of relative-entropy minimization
see [9] and a recent paper [28].

Properties of relative-entropy minimization have been studied extensively and
presented by Shore [8]. Here we briefly mention a few.

The principle of maximum entropy is equivalent to relative-entropy minimization in
the special case of discrete spaces and uniform priors, in the sense that, when the prior
is a uniform distribution with finite support $W$ (over $E \subset X$), the minimum entropy
distribution turns out to be

$$ p(x) = \frac{e^{-\sum_{m=1}^{M} \beta_m u_m(x)}}{\int_E e^{-\sum_{m=1}^{M} \beta_m u_m(x)}\, d\mu(x)} , \tag{8} $$

which is, in fact, a maximum entropy distribution of Shannon entropy with respect to
the constraints (1).

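The important relations for relative-entropy minimization are as follows. Minimum
relative-entropy, $I$, can be calculated as

$$ I = -\ln Z - \sum_{m=1}^{M} \beta_m \langle u_m \rangle , \tag{9} $$

while the thermodynamic equations are

$$ \frac{\partial}{\partial \beta_m} \ln Z = -\langle u_m \rangle , \quad m = 1, \ldots, M , \tag{10} $$

and

$$ \frac{\partial I}{\partial \langle u_m \rangle} = -\beta_m , \quad m = 1, \ldots, M . \tag{11} $$

2.2. Pythagoras' Theorem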

The statement of Pythagoras' theorem of relative-entropy minimization can be
formulated as follows [10].

Theorem 2.1. Let $r$ be the prior, and $p$ be the probability distribution that minimizes the
relative-entropy subject to a set of constraints

$$ \int_X u_m(x)\, p(x)\, d\mu(x) = \langle u_m \rangle , \quad m = 1, \ldots, M , \tag{12} $$

with respect to $\mathfrak{M}$-measurable functions $u_m : X \to \mathbb{R}$, $m = 1, \ldots, M$, whose expectation
values $\langle u_m \rangle$, $m = 1, \ldots, M$, are (assumed to be) a priori known. Let $l$ be any other
distribution satisfying the same constraints (12); then we have the triangle equality

$$ I(l\|r) = I(l\|p) + I(p\|r) . \tag{13} $$

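Proof. We have

$$ \begin{aligned} I(l\|r) &= \int_X l(x) \ln \frac{l(x)}{r(x)}\, d\mu(x) \\ &= \int_X l(x) \ln \frac{l(x)}{p(x)}\, d\mu(x) + \int_X l(x) \ln \frac{p(x)}{r(x)}\, d\mu(x) \\ &= I(l\|p) + \int_X l(x) \ln \frac{p(x)}{r(x)}\, d\mu(x) . \end{aligned} \tag{14} $$

From the minimum entropy distribution (6) we have

$$ \ln \frac{p(x)}{r(x)} = -\sum_{m=1}^{M} \beta_m u_m(x) - \ln Z . \tag{15} $$

By substituting (15) in (14) we get

$$ \begin{aligned} I(l\|r) &= I(l\|p) + \int_X l(x) \left\{ -\sum_{m=1}^{M} \beta_m u_m(x) - \ln Z \right\} d\mu(x) \\ &= I(l\|p) - \sum_{m=1}^{M} \beta_m \int_X l(x)\, u_m(x)\, d\mu(x) - \ln Z \\ &= I(l\|p) - \sum_{m=1}^{M} \beta_m \langle u_m \rangle - \ln Z \quad \text{(by hypothesis)} \\ &= I(l\|p) + I(p\|r) . \quad \text{(by (9))} \end{aligned} $$

□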



A simple consequence of the above theorem is that

$$ I(l\|r) \geq I(p\|r) , \tag{16} $$

since $I(l\|p) \geq 0$ for every pair of pdfs, with equality if and only if $l = p$.
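The triangle equality (13) and the inequality (16) can be checked numerically on a small discrete example. The sketch below is only an illustration under stated assumptions: the three-point space, the prior `r`, the function `u`, the distribution `l`, and the helper names are hypothetical, and the multiplier is found by bisection rather than by any method prescribed in the text.

```python
import math

def kl(a, b):
    """Discrete relative-entropy I(a||b) = sum a ln(a/b)."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

def posterior(r, u, beta):
    """p(x) = r(x) exp(-beta u(x)) / Z for a single constraint."""
    weights = [ri * math.exp(-beta * ui) for ri, ui in zip(r, u)]
    Z = sum(weights)
    return [w / Z for w in weights]

def mean(p, u):
    return sum(pi * ui for pi, ui in zip(p, u))

def solve_beta(r, u, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection; the mean of u under the posterior decreases with beta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(posterior(r, u, mid), u) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: l satisfies the same constraint as p by construction.
r = [0.5, 0.3, 0.2]
u = [0.0, 1.0, 2.0]
l = [0.3, 0.2, 0.5]
target = mean(l, u)              # <u> shared by l and p

p = posterior(r, u, solve_beta(r, u, target))
lhs = kl(l, r)
rhs = kl(l, p) + kl(p, r)
print("I(l||r)           =", lhs)
print("I(l||p) + I(p||r) =", rhs)
```

Any other `l` with the same expectation of `u` gives the same agreement; an `l` with a different expectation in general does not.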

Detailed discussions on the importance of Pythagoras' theorem of relative-entropy
minimization can be found in [8] and [17, pp. 72]. For a study of relative-entropy
minimization without the use of the Lagrange multiplier technique and corresponding
geometrical aspects, one can refer to [10].

The Pythagorean relation of relative-entropy minimization not only plays a fundamental
role in geometrical approaches to statistical estimation theory [11] and information
geometry [15] but is also important for applications in which relative-entropy
minimization is used for purposes of pattern classification and cluster analysis [24].

3. Tsallis Relative-Entropy Minimization 

Unlike the generalized entropy measures, ME of generalized relative-entropies has not
been much addressed in the literature. Here, one has to mention the work in [30], where the
minimum relative-entropy distribution of Tsallis relative-entropy with respect to
constraints in terms of q-expectation values is given.

In this section, we study several aspects of Tsallis relative-entropy minimization.
First we derive the minimum entropy distribution in the case of q-expectation values (see
(18)) and then in the case of normalized q-expectation values (see (35)). We propose an
elegant representation of these distributions by using a q-deformed binary operator called
the q-product $\otimes_q$. This operator is defined in [31] along similar lines as q-addition $\oplus_q$ and
q-subtraction $\ominus_q$. Since the q-product plays an important role in the nonextensive formalism,
we include a detailed discussion on the q-product in this section. Finally, we study
properties of Tsallis relative-entropy minimization and its differences with the classical
case.

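3.1. Generalized Minimum Relative-Entropy Distribution

To minimize Tsallis relative-entropy

$$ I_q(p\|r) = -\int_X p(x) \ln_q \frac{r(x)}{p(x)}\, d\mu(x) \tag{17} $$

with respect to the set of constraints specified in terms of q-expectation values

$$ \int_X u_m(x)\, p(x)^q\, d\mu(x) = \langle u_m \rangle_q , \quad m = 1, \ldots, M , \tag{18} $$

the concomitant variational principle is given as follows: Define

$$ \mathcal{L}(x, \lambda, \beta) = \int_X \ln_q \frac{r(x)}{p(x)}\, dP(x) - \lambda \left( \int_X dP(x) - 1 \right) - \sum_{m=1}^{M} \beta_m \left( \int_X p(x)^{q-1} u_m(x)\, dP(x) - \langle u_m \rangle_q \right) , \tag{19} $$

where $\lambda$ and $\beta_m$, $m = 1, \ldots, M$, are Lagrange multipliers. Now set

$$ \frac{d\mathcal{L}}{dP} = 0 . \tag{20} $$

The solution is given by

$$ \ln_q \frac{r(x)}{p(x)} - \lambda - p(x)^{q-1} \sum_{m=1}^{M} \beta_m u_m(x) = 0 , $$

which can be rearranged by using the definition of the q-logarithm, $\ln_q x = \frac{x^{1-q} - 1}{1-q}$, as

$$ p(x) = \frac{\left[ r(x)^{1-q} - (1-q) \sum_{m=1}^{M} \beta_m u_m(x) \right]^{\frac{1}{1-q}}}{\left( \lambda (1-q) + 1 \right)^{\frac{1}{1-q}}} . $$

Specifying the Lagrange parameter $\lambda$ via the normalization $\int_X p(x)\, d\mu(x) = 1$, one can
write the Tsallis minimum relative-entropy distribution as

$$ p(x) = \frac{\left[ r(x)^{1-q} - (1-q) \sum_{m=1}^{M} \beta_m u_m(x) \right]^{\frac{1}{1-q}}}{\widehat{Z}_q} , \tag{21} $$

where the partition function is given by

$$ \widehat{Z}_q = \int_X \left[ r(x)^{1-q} - (1-q) \sum_{m=1}^{M} \beta_m u_m(x) \right]^{\frac{1}{1-q}} d\mu(x) . \tag{22} $$

The values of the Lagrange parameters $\beta_m$, $m = 1, \ldots, M$, are determined using the
constraints (18).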



3.2. q-Product Representation for Tsallis Minimum Entropy Distribution 

Note that the generalized relative-entropy distribution (21) is not of the form of its
classical counterpart (6) even if we replace the exponential with the q-exponential. But
one can express (21) in a form similar to the classical case by invoking a q-deformed binary
operation called the q-product.

In the framework of q-deformed functions and operators a new multiplication, called the
q-product, is defined as

$$ x \otimes_q y = \begin{cases} \left( x^{1-q} + y^{1-q} - 1 \right)^{\frac{1}{1-q}} & \text{if } x, y > 0 \text{ and } x^{1-q} + y^{1-q} - 1 > 0 , \\ 0 & \text{otherwise.} \end{cases} \tag{23} $$

This was first introduced in [32] and explicitly defined in [31] so as to satisfy the following
equations:

$$ \ln_q (x \otimes_q y) = \ln_q x + \ln_q y , \tag{24} $$

$$ e_q^{x} \otimes_q e_q^{y} = e_q^{x+y} . \tag{25} $$

The q-product recovers the usual product in the limit $q \to 1$, i.e., $\lim_{q \to 1} (x \otimes_q y) = xy$.
The fundamental properties of the q-product are almost the same as those of the usual
product, but the distributive law does not hold in general, i.e.,

$$ a (x \otimes_q y) \neq a x \otimes_q y \quad (a, x, y \in \mathbb{R}) . $$

Further properties of the g-product can be found in 
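The q-deformed functions above are easy to experiment with. The following is a minimal sketch, with the function names `ln_q`, `exp_q` and `q_product` and all numerical values chosen here for illustration only; it checks (24), (25), the failure of the distributive law, and the $q \to 1$ limit for one set of sample values.

```python
import math

def ln_q(x, q):
    """q-logarithm: (x^(1-q) - 1) / (1 - q); ordinary log when q == 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    """q-exponential: [1 + (1-q) x]^(1/(1-q)) when the bracket is positive, else 0."""
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def q_product(x, y, q):
    """q-product (23); zero whenever the defining bracket is not positive."""
    if q == 1.0:
        return x * y
    if x <= 0 or y <= 0:
        return 0.0
    base = x ** (1.0 - q) + y ** (1.0 - q) - 1.0
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

q, x, y, a = 0.7, 1.3, 2.1, 2.0          # arbitrary sample values

# (24): the q-logarithm turns the q-product into a sum.
lhs24 = ln_q(q_product(x, y, q), q)
rhs24 = ln_q(x, q) + ln_q(y, q)

# (25): q-exponentials multiply (in the q-sense) by adding exponents.
lhs25 = q_product(exp_q(0.4, q), exp_q(0.9, q), q)
rhs25 = exp_q(0.4 + 0.9, q)

# The distributive law fails in general.
scaled_outside = a * q_product(x, y, q)
scaled_inside = q_product(a * x, y, q)

# q -> 1 recovers the ordinary product.
near_one = q_product(x, y, 1.0 - 1e-6)

print(lhs24, rhs24)
print(lhs25, rhs25)
print(scaled_outside, scaled_inside)
print(near_one, x * y)
```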
One can check the mathematical validity of the q-product by recalling the expression
of the exponential function $e^x$,

$$ e^{x} = \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^{n} . \tag{26} $$

Replacing the power on the right side of (26) by $n$ times the q-product $\otimes_q$,

$$ x^{\otimes_q n} = \underbrace{x \otimes_q \cdots \otimes_q x}_{n \text{ times}} , \tag{27} $$

one can verify that [33]

$$ e_q^{x} = \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^{\otimes_q n} . \tag{28} $$

Further mathematical significance of the q-product is demonstrated in [34] by discovering
the mathematical structure of statistics based on the Tsallis formalism: law of error,
q-Stirling's formula, q-multinomial coefficient and experimental evidence of the q-central limit
theorem.
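The limit (28) can be observed numerically by forming the $n$-fold q-product of (27) explicitly. The sketch below is an illustration only; the helper `q_power` and the values $q = 0.5$, $x = 1$ are choices made here, not taken from the paper.

```python
from functools import reduce

def exp_q(x, q):
    """q-exponential for q != 1, with the usual cut-off at a non-positive bracket."""
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def q_product(x, y, q):
    """q-product (23) for q != 1."""
    if x <= 0 or y <= 0:
        return 0.0
    base = x ** (1.0 - q) + y ** (1.0 - q) - 1.0
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def q_power(x, n, q):
    """n-fold q-product of x with itself, as in (27)."""
    return reduce(lambda acc, _: q_product(acc, x, q), range(n - 1), x)

q, x = 0.5, 1.0
target = exp_q(x, q)                      # (1 + 0.5)^2 = 2.25
errors = {n: abs(q_power(1.0 + x / n, n, q) - target)
          for n in (10, 100, 1000, 10000)}

for n, err in errors.items():
    print(n, err)
```

The error shrinks roughly like $1/n$, in line with the ordinary limit (26).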

Now, one can verify the non-trivial fact that the Tsallis minimum entropy distribution
(21) can be expressed as [35]

$$ p(x) = \frac{r(x) \otimes_q e_q^{-\sum_{m=1}^{M} \beta_m u_m(x)}}{\widehat{Z}_q} , \tag{29} $$

where

$$ \widehat{Z}_q = \int_X r(x) \otimes_q e_q^{-\sum_{m=1}^{M} \beta_m u_m(x)}\, d\mu(x) . \tag{30} $$

Later in this paper we see that this representation is useful in establishing properties of
Tsallis relative-entropy minimization and the corresponding thermodynamic equations.

It is important to note that the distribution in (21) could be a (local/global)
minimum only if $q > 0$ and the Tsallis cut-off condition specified by the Tsallis maximum
entropy distribution is extended to the relative-entropy case, i.e., $p(x) = 0$ whenever
$r(x)^{1-q} - (1-q) \sum_{m=1}^{M} \beta_m u_m(x) < 0$. The latter condition is also required for the
q-product representation of the generalized minimum entropy distribution.
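The equivalence of the direct form (21) and the q-product form (29), including the cut-off, can be checked on a finite sample space. This is a minimal sketch under assumptions made here: a three-point space, one constraint, $0 < q < 1$, hypothetical values for `r`, `u` and `beta`, and illustrative helper names.

```python
def exp_q(x, q):
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def q_product(x, y, q):
    if x <= 0 or y <= 0:
        return 0.0
    base = x ** (1.0 - q) + y ** (1.0 - q) - 1.0
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def normalize(w):
    s = sum(w)
    return [wi / s for wi in w]

def direct_form(r, u, beta, q):
    """Normalized (21): the bracket raised to 1/(1-q), zero where it is not positive."""
    out = []
    for ri, ui in zip(r, u):
        base = ri ** (1.0 - q) - (1.0 - q) * beta * ui
        out.append(base ** (1.0 / (1.0 - q)) if base > 0 else 0.0)
    return normalize(out)

def product_form(r, u, beta, q):
    """Normalized (29): r(x) q-multiplied by the q-exponential of -beta u(x)."""
    return normalize([q_product(ri, exp_q(-beta * ui, q), q) for ri, ui in zip(r, u)])

r = [0.5, 0.3, 0.2]
u = [0.0, 1.0, 2.0]
q = 0.8

# beta = 0.4: no cut-off is triggered; beta = 5.0: the cut-off zeroes two points.
p_small_direct, p_small_product = direct_form(r, u, 0.4, q), product_form(r, u, 0.4, q)
p_large_direct, p_large_product = direct_form(r, u, 5.0, q), product_form(r, u, 5.0, q)

print(p_small_direct, p_small_product)
print(p_large_direct, p_large_product)
```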

In this case, one can calculate the minimum relative-entropy $I_q$ as

$$ I_q = -\ln_q \widehat{Z}_q - \sum_{m=1}^{M} \beta_m \langle u_m \rangle_q . \tag{31} $$

To demonstrate the usefulness of the q-product representation of the generalized minimum
entropy distribution we present the verification of (31). By using the property of
q-multiplication (25), the Tsallis minimum relative-entropy distribution (29) can be written
as

$$ p(x)\, \widehat{Z}_q = e_q^{-\sum_{m=1}^{M} \beta_m u_m(x) + \ln_q r(x)} . $$

By taking the q-logarithm on both sides, we get

$$ \ln_q p(x) + \ln_q \widehat{Z}_q + (1-q) \ln_q p(x) \ln_q \widehat{Z}_q = -\sum_{m=1}^{M} \beta_m u_m(x) + \ln_q r(x) . $$

By the property of the q-logarithm $\ln_q \frac{x}{y} = y^{q-1} (\ln_q x - \ln_q y)$, we have

$$ \ln_q \frac{r(x)}{p(x)} = p(x)^{q-1} \left\{ \ln_q \widehat{Z}_q + (1-q) \ln_q p(x) \ln_q \widehat{Z}_q + \sum_{m=1}^{M} \beta_m u_m(x) \right\} . \tag{32} $$

By substituting (32) in the Tsallis relative-entropy (17) we get

$$ I_q = -\int_X p(x)^q \left\{ \ln_q \widehat{Z}_q + (1-q) \ln_q p(x) \ln_q \widehat{Z}_q + \sum_{m=1}^{M} \beta_m u_m(x) \right\} d\mu(x) . $$

By (18) and expanding $\ln_q p(x)$ one can write $I_q$ in its final form as in (31).
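The closed form for the minimum Tsallis relative-entropy, $I_q = -\ln_q \widehat{Z}_q - \sum_m \beta_m \langle u_m \rangle_q$, can also be confirmed numerically. The sketch below assumes a three-point space, a single constraint, and values of `beta` for which no cut-off occurs; the data and helper names are hypothetical.

```python
def ln_q(x, q):
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def tsallis_posterior(r, u, beta, q):
    """Distribution (21) and partition function (22); assumes every bracket is positive."""
    w = [(ri ** (1.0 - q) - (1.0 - q) * beta * ui) ** (1.0 / (1.0 - q))
         for ri, ui in zip(r, u)]
    Z = sum(w)
    return [wi / Z for wi in w], Z

def tsallis_rel_entropy(a, b, q):
    """I_q(a||b) = -sum a ln_q(b/a), the discrete form of (17)."""
    return -sum(ai * ln_q(bi / ai, q) for ai, bi in zip(a, b))

r = [0.5, 0.3, 0.2]
u = [0.0, 1.0, 2.0]
beta = 0.4

results = {}
for q in (0.8, 1.3):
    p, Z = tsallis_posterior(r, u, beta, q)
    u_q = sum(ui * pi ** q for ui, pi in zip(u, p))      # q-expectation of u
    direct = tsallis_rel_entropy(p, r, q)                 # definition (17)
    closed = -ln_q(Z, q) - beta * u_q                     # closed form (31)
    results[q] = (direct, closed)
    print(q, direct, closed)
```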

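It is easy to verify the following thermodynamic equations for the minimum Tsallis
relative-entropy:

$$ \frac{\partial}{\partial \beta_m} \ln_q \widehat{Z}_q = -\langle u_m \rangle_q , \quad m = 1, \ldots, M , \tag{33} $$

$$ \frac{\partial I_q}{\partial \langle u_m \rangle_q} = -\beta_m , \quad m = 1, \ldots, M , \tag{34} $$

which generalize the thermodynamic equations in the classical case.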
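A quick finite-difference check of these thermodynamic relations is sketched below: the derivative of $\ln_q \widehat{Z}_q$ with respect to the multiplier should equal minus the q-expectation, and the slope of $I_q$ against the q-expectation should equal minus the multiplier. The example data, the step size `h`, and the helper `state` are assumptions of this illustration.

```python
def ln_q(x, q):
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def state(r, u, beta, q):
    """Return (ln_q Z, <u>_q, I_q) for the distribution (21) with one constraint."""
    w = [(ri ** (1.0 - q) - (1.0 - q) * beta * ui) ** (1.0 / (1.0 - q))
         for ri, ui in zip(r, u)]
    Z = sum(w)
    p = [wi / Z for wi in w]
    u_q = sum(ui * pi ** q for ui, pi in zip(u, p))
    i_q = -sum(pi * ln_q(ri / pi, q) for pi, ri in zip(p, r))
    return ln_q(Z, q), u_q, i_q

r = [0.5, 0.3, 0.2]
u = [0.0, 1.0, 2.0]
q, beta, h = 0.8, 0.4, 1e-5

lnz_plus, u_plus, i_plus = state(r, u, beta + h, q)
lnz_minus, u_minus, i_minus = state(r, u, beta - h, q)
_, u_mid, _ = state(r, u, beta, q)

# Central difference for d(ln_q Z)/d(beta); compare with -<u>_q.
dlnz_dbeta = (lnz_plus - lnz_minus) / (2.0 * h)
# Secant slope of I_q against <u>_q along the beta-curve; compare with -beta.
di_du = (i_plus - i_minus) / (u_plus - u_minus)

print(dlnz_dbeta, -u_mid)
print(di_du, -beta)
```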
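3.3. The Case of Normalized q-Expectations

In this section we discuss Tsallis relative-entropy minimization with respect to
constraints in the form of normalized q-expectations

$$ \frac{\int_X u_m(x)\, p(x)^q\, d\mu(x)}{\int_X p(x)^q\, d\mu(x)} = \langle\langle u_m \rangle\rangle_q , \quad m = 1, \ldots, M . \tag{35} $$

The variational principle for Tsallis relative-entropy minimization in this case is as
below. Let

$$ \mathcal{L}(x, \lambda, \beta) = \int_X \ln_q \frac{r(x)}{p(x)}\, dP(x) - \lambda \left( \int_X dP(x) - 1 \right) - \sum_{m=1}^{M} \beta_m^{(q)} \int_X p(x)^{q-1} \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right) dP(x) , \tag{36} $$

where the parameters $\beta_m^{(q)}$ can be defined in terms of the true Lagrange parameters $\beta_m$
as

$$ \beta_m^{(q)} = \frac{\beta_m}{\int_X p(x)^q\, d\mu(x)} , \quad m = 1, \ldots, M . \tag{37} $$

This gives the minimum entropy distribution as

$$ p(x) = \frac{1}{\widehat{\overline{Z}}_q} \left[ r(x)^{1-q} - (1-q) \frac{\sum_{m=1}^{M} \beta_m \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right)}{\int_X p(x)^q\, d\mu(x)} \right]^{\frac{1}{1-q}} , \tag{38} $$

where

$$ \widehat{\overline{Z}}_q = \int_X \left[ r(x)^{1-q} - (1-q) \frac{\sum_{m=1}^{M} \beta_m \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right)}{\int_X p(x)^q\, d\mu(x)} \right]^{\frac{1}{1-q}} d\mu(x) . $$

Now, the minimum entropy distribution (38) can be expressed using the q-product (23)
as

$$ p(x) = \frac{1}{\widehat{\overline{Z}}_q}\; r(x) \otimes_q \exp_q \left( - \frac{\sum_{m=1}^{M} \beta_m \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right)}{\int_X p(x)^q\, d\mu(x)} \right) . \tag{39} $$

Minimum Tsallis relative-entropy $I_q$ in this case satisfies

$$ I_q = -\ln_q \widehat{\overline{Z}}_q , \tag{40} $$

while one can derive the following thermodynamic equations:

$$ \frac{\partial}{\partial \beta_m} \ln_q \widehat{\overline{Z}}{}'_q = -\langle\langle u_m \rangle\rangle_q , \quad m = 1, \ldots, M , \tag{41} $$

$$ \frac{\partial I_q}{\partial \langle\langle u_m \rangle\rangle_q} = -\beta_m , \quad m = 1, \ldots, M , \tag{42} $$

where

$$ \ln_q \widehat{\overline{Z}}{}'_q = \ln_q \widehat{\overline{Z}}_q - \sum_{m=1}^{M} \beta_m \langle\langle u_m \rangle\rangle_q . \tag{43} $$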

4. Nonextensive Pythagoras' Theorem 



With the above study of Tsallis relative-entropy minimization, in this section we present
our main result, Pythagoras' theorem or the triangle equality (Theorem 2.1) generalized to
the nonextensive case. To present this result, we first discuss the significance of the triangle
equality in the classical case. We then restate Theorem 2.1 in a form that is essential for the derivation
of the triangle equality in the nonextensive framework.
4.1. Pythagoras' Theorem Restated

The significance of the triangle equality is evident in the following scenario. Let $r$ be the
prior estimate of the unknown probability distribution $l$, about which information
in the form of constraints

$$ \int_X u_m(x)\, l(x)\, d\mu(x) = \langle u_m \rangle , \quad m = 1, \ldots, M \tag{44} $$

is available with respect to the fixed functions $u_m$, $m = 1, \ldots, M$. The problem is to
choose a posterior estimate $p$ that is in some sense the best estimate of $l$ given the
available information, i.e., the prior $r$ and the information in the form of expected values
(44). Kullback's minimum entropy principle provides a general solution to this inference
problem and provides us the estimate (6) when we minimize the relative-entropy $I(p\|r)$ with
respect to the constraints

$$ \int_X u_m(x)\, p(x)\, d\mu(x) = \langle u_m \rangle , \quad m = 1, \ldots, M . \tag{45} $$

This estimate of the posterior $p$ by Kullback's minimum entropy principle also offers
the relation (Theorem 2.1)

$$ I(l\|r) = I(l\|p) + I(p\|r) , \tag{46} $$

from which one can draw the following conclusions. By (16), the minimum
relative-entropy posterior estimate of $l$ is not only logically consistent, but also closer to $l$, in
the relative-entropy sense, than is the prior $r$. Moreover, the difference $I(l\|r) - I(l\|p)$ is
exactly the relative-entropy $I(p\|r)$ between the posterior and the prior. Hence, $I(p\|r)$
can be interpreted as the amount of information provided by the constraints that is not
inherent in $r$.

Additional justification for using the minimum relative-entropy estimate of $p$ with respect
to the constraints (45) is provided by the following expected value matching property [8].
To explain this concept we restate our above estimation problem as follows.

For fixed functions $u_m$, $m = 1, \ldots, M$, let the actual unknown distribution $l$ satisfy

$$ \int_X u_m(x)\, l(x)\, d\mu(x) = \langle w_m \rangle , \quad m = 1, \ldots, M , \tag{47} $$

where $\langle w_m \rangle$, $m = 1, \ldots, M$, are expected values of $l$, the only information available about
$l$ apart from the prior $r$. To apply the minimum entropy principle to obtain the posterior
estimate $p$ of $l$, one has to determine the constraints for $p$ with respect to which we
minimize $I(p\|r)$. Equivalently, by assuming that $p$ satisfies constraints of the form
(45), one has to determine the expected values $\langle u_m \rangle$, $m = 1, \ldots, M$.

Now, as $\langle u_m \rangle$, $m = 1, \ldots, M$, vary, one can show that $I(l\|p)$ has its minimum
value when

$$ \langle u_m \rangle = \langle w_m \rangle , \quad m = 1, \ldots, M . \tag{48} $$
The proof is as follows [8]. Proceeding as in the proof of Theorem 2.1, we have

$$ \begin{aligned} I(l\|p) &= I(l\|r) + \sum_{m=1}^{M} \beta_m \int_X l(x)\, u_m(x)\, d\mu(x) + \ln Z \\ &= I(l\|r) + \sum_{m=1}^{M} \beta_m \langle w_m \rangle + \ln Z . \quad \text{(by (47))} \end{aligned} \tag{49} $$

Since the variation of $I(l\|p)$ with respect to $\langle u_m \rangle$ results in the variation of $I(l\|p)$ with
respect to $\beta_m$ for any $m = 1, \ldots, M$, to find the minimum of $I(l\|p)$ one can solve

$$ \frac{\partial}{\partial \beta_m} I(l\|p) = 0 , \quad m = 1, \ldots, M , $$

which, by the thermodynamic equation (10), gives the solution as in (48).

This property of expectation matching states that, for a distribution $p$ of the form
(6), $I(l\|p)$ is the smallest when the expected values of $p$ match those of $l$. In particular,
$p$ is not only the distribution that minimizes $I(p\|r)$ but also the one that minimizes $I(l\|p)$.
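The expectation matching property can be seen numerically: among posteriors of the exponential form $p_\beta(x) \propto r(x) e^{-\beta u(x)}$, the divergence $I(l\|p_\beta)$ is smallest at the multiplier for which the expectation of $u$ under $p_\beta$ equals that under $l$. The sketch below assumes a three-point space, one constraint, hypothetical data, and a bisection solver chosen for this illustration.

```python
import math

def kl(a, b):
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

def posterior(r, u, beta):
    weights = [ri * math.exp(-beta * ui) for ri, ui in zip(r, u)]
    Z = sum(weights)
    return [w / Z for w in weights]

def mean(p, u):
    return sum(pi * ui for pi, ui in zip(p, u))

def solve_beta(r, u, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection on beta so that the posterior mean of u equals target."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(posterior(r, u, mid), u) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = [0.5, 0.3, 0.2]          # prior
u = [0.0, 1.0, 2.0]          # constraint function
l = [0.3, 0.2, 0.5]          # "true" distribution, known only through <w>
w = mean(l, u)               # <w>

beta_star = solve_beta(r, u, w)              # multiplier with <u> = <w>

def divergence(beta):
    return kl(l, posterior(r, u, beta))

d_star = divergence(beta_star)
d_left = divergence(beta_star - 0.1)
d_right = divergence(beta_star + 0.1)
print(d_left, d_star, d_right)
```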

We now restate Theorem 2.1 in a form which summarizes the above discussion.

Theorem 4.1. Let $r$ be the prior distribution, and $p$ be the probability distribution that
minimizes the relative-entropy subject to a set of constraints

$$ \int_X u_m(x)\, p(x)\, d\mu(x) = \langle u_m \rangle , \quad m = 1, \ldots, M . \tag{50} $$

Let $l$ be any other distribution satisfying the constraints

$$ \int_X u_m(x)\, l(x)\, d\mu(x) = \langle w_m \rangle , \quad m = 1, \ldots, M . \tag{51} $$

Then

(i) $I(l\|p)$ is minimum only if (expectation matching property)

$$ \langle u_m \rangle = \langle w_m \rangle , \quad m = 1, \ldots, M . \tag{52} $$

(ii) When (52) holds, we have

$$ I(l\|r) = I(l\|p) + I(p\|r) . \tag{53} $$

By the above interpretation of the triangle equality and the analogy with the comparable
situation in Euclidean geometry, it is natural to call $p$, as defined by (6), the projection
of $r$ on the plane described by (50). Csiszár [10] has introduced a generalization of this
notion to define the projection of $r$ on any convex set $\mathcal{E}$ of probability distributions. If
$p \in \mathcal{E}$ satisfies the equation

$$ I(p\|r) = \min_{s \in \mathcal{E}} I(s\|r) , \tag{54} $$

then $p$ is called the projection of $r$ on $\mathcal{E}$. Csiszár [10] develops a number of results about
these projections for both finite and infinite dimensional spaces. In this paper, we will
not consider this general approach.
4.2. The Case of q-Expectations

From the above discussion, it is clear that to derive the triangle equality of Tsallis 
relative-entropy minimization, one should first deduce the equivalent of expectation 
matching property in the nonextensive case. 

We state and prove below the Pythagoras' theorem in the nonextensive framework
established by Dukkipati et al. [1].

Theorem 4.2. Let $r$ be the prior distribution, and $p$ be the probability distribution that
minimizes the Tsallis relative-entropy subject to a set of constraints

$$ \int_X u_m(x)\, p(x)^q\, d\mu(x) = \langle u_m \rangle_q , \quad m = 1, \ldots, M . \tag{55} $$

Let $l$ be any other distribution satisfying the constraints

$$ \int_X u_m(x)\, l(x)^q\, d\mu(x) = \langle w_m \rangle_q , \quad m = 1, \ldots, M . \tag{56} $$

Then

(i) $I_q(l\|p)$ is minimum only if

$$ \langle u_m \rangle_q = \frac{\langle w_m \rangle_q}{1 - (1-q)\, I_q(l\|p)} , \quad m = 1, \ldots, M . \tag{57} $$

(ii) Under (57), we have

$$ I_q(l\|r) = I_q(l\|p) + I_q(p\|r) + (q-1)\, I_q(l\|p)\, I_q(p\|r) . \tag{58} $$

Proof. First we deduce the equivalent of the expectation matching property in the
nonextensive case. That is, we would like to find the values of $\langle u_m \rangle_q$ for which $I_q(l\|p)$ is
minimum. We write the following useful relations before we proceed to the derivation.
We can write the generalized minimum entropy distribution (29) as

$$ p(x) = \frac{r(x) \otimes_q e_q^{-\sum_{m=1}^{M} \beta_m u_m(x)}}{\widehat{Z}_q} = \frac{e_q^{-\sum_{m=1}^{M} \beta_m u_m(x) + \ln_q r(x)}}{\widehat{Z}_q} , \tag{59} $$

by using the relations $e_q^{\ln_q x} = x$ and $e_q^{x} \otimes_q e_q^{y} = e_q^{x+y}$. Further, by using

$$ \ln_q (xy) = \ln_q x + \ln_q y + (1-q) \ln_q x \ln_q y $$

we can write (59) as

$$ \ln_q p(x) + \ln_q \widehat{Z}_q + (1-q) \ln_q p(x) \ln_q \widehat{Z}_q = -\sum_{m=1}^{M} \beta_m u_m(x) + \ln_q r(x) . \tag{60} $$

By the property of the q-logarithm

$$ \ln_q \left( \frac{x}{y} \right) = y^{q-1} \left( \ln_q x - \ln_q y \right) , \tag{61} $$

and by the q-logarithmic representation of Tsallis entropy,

$$ S_q(p) = -\int_X p(x)^q \ln_q p(x)\, d\mu(x) , $$

one can verify that

$$ I_q(p\|r) = -\int_X p(x)^q \ln_q r(x)\, d\mu(x) - S_q(p) . \tag{62} $$



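With these relations in hand we proceed with the derivation. Consider

$$ I_q(l\|p) = -\int_X l(x) \ln_q \frac{p(x)}{l(x)}\, d\mu(x) . $$

By (61) we have

$$ \begin{aligned} I_q(l\|p) &= -\int_X l(x)^q \left[ \ln_q p(x) - \ln_q l(x) \right] d\mu(x) \\ &= I_q(l\|r) - \int_X l(x)^q \left[ \ln_q p(x) - \ln_q r(x) \right] d\mu(x) . \end{aligned} \tag{63} $$

From (60), we get

$$ \begin{aligned} I_q(l\|p) = I_q(l\|r) &+ \int_X l(x)^q \left[ \sum_{m=1}^{M} \beta_m u_m(x) \right] d\mu(x) + \ln_q \widehat{Z}_q \int_X l(x)^q\, d\mu(x) \\ &+ (1-q) \ln_q \widehat{Z}_q \int_X l(x)^q \ln_q p(x)\, d\mu(x) . \end{aligned} \tag{64} $$

By using (56) and (62),

$$ \begin{aligned} I_q(l\|p) = I_q(l\|r) &+ \sum_{m=1}^{M} \beta_m \langle w_m \rangle_q + \ln_q \widehat{Z}_q \int_X l(x)^q\, d\mu(x) \\ &+ (1-q) \ln_q \widehat{Z}_q \left[ -I_q(l\|p) - S_q(l) \right] , \end{aligned} \tag{65} $$

and by the expression of Tsallis entropy $S_q(l) = \frac{1}{q-1} \left[ 1 - \int_X l(x)^q\, d\mu(x) \right]$, we have

$$ I_q(l\|p) = I_q(l\|r) + \sum_{m=1}^{M} \beta_m \langle w_m \rangle_q + \ln_q \widehat{Z}_q - (1-q) \ln_q \widehat{Z}_q\, I_q(l\|p) . \tag{66} $$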



Since the multipliers $\beta_m$, $m = 1, \ldots, M$, are functions of the expected values $\langle u_m \rangle_q$,
variations in the expected values are equivalent to variations in the multipliers. Hence
to find the minimum of $I_q(l\|p)$, we solve

$$ \frac{\partial}{\partial \beta_m} I_q(l\|p) = 0 . \tag{67} $$

By using the thermodynamic equation (33), the solution of (67) provides us with the
expectation matching property in the nonextensive case:

$$ \langle u_m \rangle_q = \frac{\langle w_m \rangle_q}{1 - (1-q)\, I_q(l\|p)} , \quad m = 1, \ldots, M . \tag{68} $$

In the limit $q \to 1$ the above equation gives $\langle u_m \rangle_1 = \langle w_m \rangle_1$, which is the expectation
matching property in the classical case.

Now, to derive the triangle equality for Tsallis relative-entropy minimization, we
substitute the expression for $\langle w_m \rangle_q$, which is given by (68), in (66). After some
algebra, using (31), one arrives at (58). □
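The algebra behind the last step can be checked numerically. Combining the relation $I_q(l\|p) = I_q(l\|r) + \sum_m \beta_m \langle w_m \rangle_q + \ln_q \widehat{Z}_q - (1-q) \ln_q \widehat{Z}_q\, I_q(l\|p)$ with $I_q(p\|r) = -\ln_q \widehat{Z}_q - \sum_m \beta_m \langle u_m \rangle_q$ shows that, for an arbitrary $l$, the defect of the nonextensive triangle equality equals $\sum_m \beta_m \left[ \langle u_m \rangle_q \left( 1 - (1-q) I_q(l\|p) \right) - \langle w_m \rangle_q \right]$, which vanishes exactly under the nonextensive expectation matching condition. This rearrangement is worked out here as an illustration and is not quoted from the paper; the sketch below tests it on a hypothetical three-point example with one constraint.

```python
def ln_q(x, q):
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def tsallis_posterior(r, u, beta, q):
    """Tsallis minimum relative-entropy distribution; assumes no cut-off is triggered."""
    w = [(ri ** (1.0 - q) - (1.0 - q) * beta * ui) ** (1.0 / (1.0 - q))
         for ri, ui in zip(r, u)]
    Z = sum(w)
    return [wi / Z for wi in w]

def i_q(a, b, q):
    """Tsallis relative-entropy I_q(a||b) = -sum a ln_q(b/a)."""
    return -sum(ai * ln_q(bi / ai, q) for ai, bi in zip(a, b))

def q_expectation(u, a, q):
    return sum(ui * ai ** q for ui, ai in zip(u, a))

r = [0.5, 0.3, 0.2]
u = [0.0, 1.0, 2.0]
l = [0.2, 0.5, 0.3]          # arbitrary distribution, not tuned to any constraint
q, beta = 0.8, 0.4

p = tsallis_posterior(r, u, beta, q)
A = i_q(l, p, q)             # I_q(l||p)
B = i_q(p, r, q)             # I_q(p||r)
L = i_q(l, r, q)             # I_q(l||r)
U = q_expectation(u, p, q)   # <u>_q
W = q_expectation(u, l, q)   # <w>_q

defect = L - A - B - (q - 1.0) * A * B
predicted = beta * (U * (1.0 - (1.0 - q) * A) - W)
print(defect, predicted)

# Trivial consistency check: l = p satisfies the matching condition.
defect_trivial = i_q(p, r, q) - i_q(p, p, q) - B - (q - 1.0) * i_q(p, p, q) * B
```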

Note that the limit $q \to 1$ in (58) gives the triangle equality in the classical case (53).
The two important cases which arise out of (58) are

$$ I_q(l\|r) \leq I_q(l\|p) + I_q(p\|r) \quad \text{when } 0 < q \leq 1 , \tag{69} $$

$$ I_q(l\|r) \geq I_q(l\|p) + I_q(p\|r) \quad \text{when } 1 < q . \tag{70} $$

We refer to Theorem 4.2 as the nonextensive Pythagoras' theorem and to (58) as the
nonextensive triangle equality, whose pseudo-additivity is consistent with the
pseudo-additivity of Tsallis relative-entropy

$$ I_q(X_1 \times Y_1 \| X_2 \times Y_2) = I_q(X_1\|X_2) + I_q(Y_1\|Y_2) + (q-1)\, I_q(X_1\|X_2)\, I_q(Y_1\|Y_2) , \tag{71} $$

where $X_1, X_2$ and $Y_1, Y_2$ are random variables such that $X_1$ and $Y_1$ are independent, and $X_2$ and
$Y_2$ are independent, respectively; hence (58) is a natural generalization of the triangle equality in
the classical case.
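The pseudo-additivity of Tsallis relative-entropy for independent pairs can be confirmed directly on small discrete distributions. In the sketch below, the marginals `x1`, `y1`, `x2`, `y2` and the helper `product` are hypothetical choices for illustration.

```python
def ln_q(x, q):
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def i_q(a, b, q):
    """Tsallis relative-entropy I_q(a||b) = -sum a ln_q(b/a)."""
    return -sum(ai * ln_q(bi / ai, q) for ai, bi in zip(a, b))

def product(a, b):
    """Joint distribution of two independent variables with marginals a and b."""
    return [ai * bj for ai in a for bj in b]

x1, y1 = [0.6, 0.4], [0.2, 0.5, 0.3]     # marginals of the first pair
x2, y2 = [0.3, 0.7], [0.4, 0.4, 0.2]     # marginals of the second pair

gaps = {}
for q in (0.7, 1.5):
    joint = i_q(product(x1, y1), product(x2, y2), q)
    ix, iy = i_q(x1, x2, q), i_q(y1, y2, q)
    gaps[q] = joint - (ix + iy + (q - 1.0) * ix * iy)
    print(q, joint, ix + iy + (q - 1.0) * ix * iy)
```

The cross term $(q-1) I_q I_q$ has the same structure as the extra term in the nonextensive triangle equality, which is the consistency the text refers to.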



4.3. In the Case of Normalized q-Expectations

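In the case of normalized q-expectations too, the Tsallis relative-entropy satisfies the
nonextensive triangle equality, with conditions modified from the case of q-expectation
values [1, 36].

Theorem 4.3. Let $r$ be the prior distribution, and $p$ be the probability distribution that
minimizes the Tsallis relative-entropy subject to the set of constraints

$$ \frac{\int_X u_m(x)\, p(x)^q\, d\mu(x)}{\int_X p(x)^q\, d\mu(x)} = \langle\langle u_m \rangle\rangle_q , \quad m = 1, \ldots, M . \tag{72} $$

Let $l$ be any other distribution satisfying the constraints

$$ \frac{\int_X u_m(x)\, l(x)^q\, d\mu(x)}{\int_X l(x)^q\, d\mu(x)} = \langle\langle w_m \rangle\rangle_q , \quad m = 1, \ldots, M . \tag{73} $$

Then we have

$$ I_q(l\|r) = I_q(l\|p) + I_q(p\|r) + (q-1)\, I_q(l\|p)\, I_q(p\|r) , \tag{74} $$

provided

$$ \langle\langle u_m \rangle\rangle_q = \langle\langle w_m \rangle\rangle_q , \quad m = 1, \ldots, M . \tag{75} $$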
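Proof. From the Tsallis minimum entropy distribution $p$ in the case of normalized
q-expected values (39), we have

$$ \ln_q r(x) - \ln_q p(x) = \ln_q \widehat{\overline{Z}}_q + (1-q) \ln_q p(x) \ln_q \widehat{\overline{Z}}_q + \frac{\sum_{m=1}^{M} \beta_m \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right)}{\int_X p(x)^q\, d\mu(x)} . \tag{76} $$

Proceeding as in the proof of Theorem 4.2, we have

$$ I_q(l\|p) = I_q(l\|r) - \int_X l(x)^q \left[ \ln_q p(x) - \ln_q r(x) \right] d\mu(x) . \tag{77} $$

From (76), we obtain

$$ \begin{aligned} I_q(l\|p) = I_q(l\|r) &+ \ln_q \widehat{\overline{Z}}_q \int_X l(x)^q\, d\mu(x) + (1-q) \ln_q \widehat{\overline{Z}}_q \int_X l(x)^q \ln_q p(x)\, d\mu(x) \\ &+ \frac{1}{\int_X p(x)^q\, d\mu(x)} \sum_{m=1}^{M} \beta_m \int_X l(x)^q \left( u_m(x) - \langle\langle u_m \rangle\rangle_q \right) d\mu(x) . \end{aligned} \tag{78} $$

By (73) the same can be written as

$$ \begin{aligned} I_q(l\|p) = I_q(l\|r) &+ \ln_q \widehat{\overline{Z}}_q \int_X l(x)^q\, d\mu(x) + (1-q) \ln_q \widehat{\overline{Z}}_q \int_X l(x)^q \ln_q p(x)\, d\mu(x) \\ &+ \frac{\int_X l(x)^q\, d\mu(x)}{\int_X p(x)^q\, d\mu(x)} \sum_{m=1}^{M} \beta_m \left( \langle\langle w_m \rangle\rangle_q - \langle\langle u_m \rangle\rangle_q \right) . \end{aligned} \tag{79} $$

By using the relations

$$ \int_X l(x)^q \ln_q p(x)\, d\mu(x) = -I_q(l\|p) - S_q(l) $$

and

$$ \int_X l(x)^q\, d\mu(x) = (1-q)\, S_q(l) + 1 , $$

(79) can be written as

$$ I_q(l\|p) = I_q(l\|r) + \ln_q \widehat{\overline{Z}}_q - (1-q) \ln_q \widehat{\overline{Z}}_q\, I_q(l\|p) + \frac{\int_X l(x)^q\, d\mu(x)}{\int_X p(x)^q\, d\mu(x)} \sum_{m=1}^{M} \beta_m \left( \langle\langle w_m \rangle\rangle_q - \langle\langle u_m \rangle\rangle_q \right) . \tag{80} $$

Finally, using (40) and (75), we have the nonextensive triangle equality (74). □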
Note that in this case the minimum of $I_q(l\|p)$ is not guaranteed. Also, the condition
(75) for the nonextensive triangle equality here is the same as the expectation value matching
property in the classical case.

5. Conclusions 

Pythagoras' theorem of relative-entropy plays an important role in geometrical
approaches to statistical estimation theory such as information geometry. In this paper we
presented Pythagoras' theorem in the nonextensive case, i.e., for Tsallis relative-entropy
minimization. In our opinion, this result is yet another remarkable and consistent
generalization shown by the Tsallis formalism.

Now, equipped with the nonextensive Pythagoras' theorem in the generalized case
of Tsallis, it would be interesting to know the resultant geometry when we use generalized
information measures, and the role of the entropic index in that geometry.